Morphological Analysis of Historical Japanese Text



Similar resources

Corpus-based Japanese morphological analysis

The goal of this study is to improve corpus-based Japanese morphological analysis, which consists of word segmentation and part-of-speech (POS) tagging. We divide the problem of Japanese morphological analysis into three subproblems: models for known words, models for unknown words, and a corpus maintenance schema. First, we discuss Markov model-based approaches for known-word processing. ...


Image Analysis for Historical Japanese Book Archives

This paper describes methods of image analysis for historical Japanese book archives with a dominant focus on character segmentation. The segmentation methodology includes stain and smear removal, binarization, character line extraction, and character extraction by region labeling with integration and separation techniques. The experimental results show that the proposed method can segment all ...


Morphological Analysis for Japanese Noisy Text based on Character-level and Word-level Normalization

Social media texts are often written in a non-standard style and include many lexical variants, such as insertions, phonetic substitutions, and abbreviations that mimic spoken language. Normalizing this variety of non-standard tokens is one promising solution for handling noisy text. A normalization task is very difficult to conduct in Japanese morphological analysis because there are no ...


Corpus and Text Analysis of Spontaneous Japanese

There are three major parts of the “Spontaneous Speech: Corpus and Processing Technology” project: (1) compilation of a large spontaneous speech corpus, (2) establishment of spoken language engineering based on the corpus, and (3) development of a prototype spoken language summarization system. This paper describes how we help to develop this large corpus, i.e., (1), using technology developed a...


Morphological Analysis and Diacritical Arabic Text Compression

Morphological analysis of Arabic words reduces the storage requirements of Arabic dictionaries and enables more efficient encoding of diacritical Arabic text, faster spelling, and efficient optical character recognition. All these factors allow efficient storage and archival of multilingual digital libraries that include Arabic texts. This paper presents a lossless compression algorithm based...



Journal

Journal title: Journal of Natural Language Processing

Year: 2013

ISSN: 1340-7619

DOI: 10.5715/jnlp.20.727